Abstract

The normality measure $\mathcal{N}$ was introduced by Mauduit and Sárközy in order to
describe the pseudorandomness properties of finite binary sequences. Alon, Kohayakawa,
Mauduit, Moreira and Rödl proved that the minimal possible value of the normality
measure of an $N$-element binary sequence satisfies

\[ \left( \frac{1}{2} + o(1) \right) \log_2 N \leq \min_{E_N \in \{0,1\}^N} \mathcal{N}(E_N) \leq 3 N^{1/3} (\log N)^{2/3} \]

for sufficiently large $N$. In the present paper we improve the upper bound to $c (\log N)^2$ for
some constant $c$, thereby determining the asymptotic order of the minimal value of the
normality measure up to a logarithmic factor, and disproving a conjecture of Alon et al.
The proof is based on relating the normality measure of binary sequences to
the discrepancy of normal numbers in base 2.

1 Introduction and statement of results 

Let a finite binary sequence $E_N = (e_1, \dots, e_N) \in \{0,1\}^N$ be given. For $k \geq 1$, $M \geq 1$ and
$X \in \{0,1\}^k$, we set

\[ T(E_N, M, X) = \# \{ n : 0 \leq n < M, \text{ and } (e_{n+1}, \dots, e_{n+k}) = X \}, \]

which means that $T(E_N, M, X)$ counts the number of occurrences of the pattern $X$ among
the first $M + k - 1$ elements of $E_N$. The normality measure $\mathcal{N}(E_N)$ is defined as

\[ \mathcal{N}(E_N) = \max_{1 \leq k \leq \log_2 N} \; \max_{X \in \{0,1\}^k} \; \max_{1 \leq M \leq N+1-k} \left| T(E_N, M, X) - \frac{M}{2^k} \right|. \tag{1} \]
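Definition (1) can be evaluated directly for short sequences. The following is a minimal brute-force sketch, not taken from the paper; the function name `normality_measure` is hypothetical, and exact rational arithmetic is used so that the result is not affected by rounding.

```python
from fractions import Fraction
from itertools import product


def normality_measure(e):
    """Brute-force evaluation of definition (1) for a 0/1 sequence e."""
    big_n = len(e)
    k_max = big_n.bit_length() - 1        # largest integer k with 2**k <= N
    best = Fraction(0)
    for k in range(1, k_max + 1):
        for x in product((0, 1), repeat=k):
            count = 0                     # running value of T(E_N, M, X)
            for m in range(1, big_n + 2 - k):      # 1 <= M <= N + 1 - k
                # Going from M - 1 to M adds the window starting at n = M - 1,
                # i.e. (e_M, ..., e_{M+k-1}), which is e[m-1 : m-1+k] 0-indexed.
                if tuple(e[m - 1:m - 1 + k]) == x:
                    count += 1
                best = max(best, abs(count - Fraction(m, 2 ** k)))
    return best
```

For the constant sequence of eight zeros, `normality_measure([0] * 8)` returns 21/4, attained for the pattern (0, 0) with M = 7. The running time grows like N^2 log N, so this is only meant for experimenting with small N.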



The normality measure was introduced in 1997 by Mauduit and Sárközy [17], together
with several other measures of pseudorandomness for finite binary sequences.^1

^1 Strictly speaking, Mauduit and Sárközy defined their pseudorandomness measures for sequences over the
alphabet $\{-1, 1\}$ (instead of $\{0, 1\}$, as in the present paper). However, in the case of the normality measure
the numerical values of the digits $e_n$ are of no significance whatsoever, since they are used as mere symbols.
In the present paper, it is more convenient for our purpose to study sequences defined over the alphabet $\{0, 1\}$
(since they can be related to the binary representation of real numbers), and the definitions have been modified
accordingly.

In two papers, Alon, Kohayakawa, Mauduit, Moreira and Rödl [2, 3] studied the minimal and the typical
values of the normality measure (and other measures of pseudorandomness). Concerning the
typical value of $\mathcal{N}$, they proved that for any $\varepsilon > 0$ there exist $\delta_1, \delta_2 > 0$ such that for $E_N$
uniformly distributed in $\{0, 1\}^N$

\[ \delta_1 \sqrt{N} < \mathcal{N}(E_N) < \delta_2 \sqrt{N} \]

holds with probability at least $1 - \varepsilon$ for sufficiently large $N$, and conjectured that a limit
distribution of

\[ \frac{\mathcal{N}(E_N)}{\sqrt{N}} \]

exists; the latter has been recently confirmed [1]. Concerning the minimal value of $\mathcal{N}$, Alon
et al. proved that

\[ \left( \frac{1}{2} + o(1) \right) \log_2 N \leq \min_{E_N \in \{0,1\}^N} \mathcal{N}(E_N) \leq 3 N^{1/3} (\log N)^{2/3} \tag{2} \]

for sufficiently large $N$. The lower bound in (2) is based on a relatively simple combinatorial
argument. The proof of the upper bound in (2) is rather elaborate; however, it is entirely
constructive, using an explicit algebraic construction based on finite fields. Concerning a
possible improvement of (2), Alon et al. write in [2]

"We suspect that the logarithmic lower bound in [equation (2)] is far from the
truth."

and formulate the open problem

"Is there an absolute constant $\alpha > 0$ for which we have

\[ \min_{E_N} \mathcal{N}(E_N) > N^{\alpha} \]

for all large enough $N$?"

In [3] they write

"The authors believe that the answer to [the open problem above] is positive."

The purpose of the present paper is to close the gap between the lower and upper bounds in (2),
and to settle the problem of the asymptotic order of the minimal normality measure of
binary sequences, up to a logarithmic factor. More precisely, we will prove that

\[ \min_{E_N \in \{0,1\}^N} \mathcal{N}(E_N) = O\left( (\log N)^2 \right), \tag{3} \]

thereby giving a negative answer to the problem of Alon et al. and disproving their
conjecture.

Theorem 1. There exists a constant $c$ such that

\[ \min_{E_N \in \{0,1\}^N} \mathcal{N}(E_N) \leq c (\log N)^2 \]

for sufficiently large $N$.

The key ingredient of the proof of Theorem 1 is to relate the problem of finding binary
sequences having small normality measure to the problem of finding normal numbers having
small discrepancy. We will describe definitions, basic properties and important results
concerning normal numbers in Section 2 below; the proof of Theorem 1 will be given subsequently
in Section 3. The proof of Theorem 1 is constructive, providing a more or less explicit example
of a sequence satisfying the upper bound in the theorem.



2 Normal numbers 

Normal numbers were introduced by Borel [6] in 1909. Let $z \in [0, 1)$ be a real number,
and denote its binary expansion by

\[ z = 0.z_1 z_2 z_3 \dots \]

Then $z$ is called a normal number (in base 2, which is the only base that we are interested
in in the present paper) if for any $k \geq 1$ and any block of digits $X \in \{0, 1\}^k$ the
asymptotic relative frequency of appearances of $X$ in the binary expansion of $z$ is
$2^{-k}$. Using the terminology from the previous section and writing $Z_N = (z_1, \dots, z_N)$ for the
sequence of the first $N$ digits of $z$, this can be expressed as

\[ \lim_{N \to \infty} \frac{T(Z_N, N + 1 - k, X)}{N} = 2^{-k}, \]

where $k$ is the length of $X$. Borel proved that almost all numbers (in the sense of Lebesgue
measure) are normal.^2 There exist many constructions of normal numbers, the first of them
being obtained by concatenating the digital representations of the positive integers
(Champernowne [7], 1933), primes (Copeland and Erdős [8], 1946) and values of polynomials (Davenport
and Erdős [9], 1952). Deciding whether a given real number is normal or not is a very difficult
problem, and it is unknown whether constants such as $\sqrt{2}$, $e$ and $\pi$ are normal or not (see [4]).

Informally, normal numbers (or the corresponding infinite sequences of digits) are
often considered as numbers showing "random" behavior (which is justified by the
aforementioned theorem of Borel). In fact, different variants of the normality property have been
considered as a test for pseudorandomness of (infinite) sequences of digits, for example in the
monograph of Knuth [13] on The Art of Computer Programming, and the normality measure
of Mauduit and Sárközy is a quantitative version of such a pseudorandomness test for the case
of a finite sequence of digits. For a discussion of the connection between normal numbers,
pseudorandomness of (finite) sequences, and pseudorandom number generators, see the book
of Knuth and the papers of Mauduit and Sarkozy On finite pseudorandom binary sequences 
I- VII, as well as lai^. 

To proceed further, we need some notation. A sequence of real numbers $(y_n)_{n \geq 1}$ from the unit
interval is called uniformly distributed modulo one (u.d. mod 1) if for all intervals $[a, b) \subset [0, 1)$
the limit relation

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mathbf{1}_{[a,b)}(y_n) = b - a \]

holds.

^2 This is the first ever appearance of what we call today the strong law of large numbers, for the special case
of the i.i.d. system of the Rademacher functions on the unit interval.

The quality of the uniform distribution of a sequence is measured in terms of the
discrepancy $D_N$, which for $N \geq 1$ is defined as

\[ D_N(y_1, \dots, y_N) = \sup_{0 \leq a < b \leq 1} \left| \frac{1}{N} \sum_{n=1}^{N} \mathbf{1}_{[a,b)}(y_n) - (b - a) \right|. \]

A sequence is u.d. mod 1 if and only if its discrepancy tends to zero as $N \to \infty$.
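For a finite point set the supremum in the definition of the discrepancy can be computed exactly. The sketch below is an illustration rather than part of the paper: it uses the well-known closed form for sorted points $y_{(1)} \leq \dots \leq y_{(N)}$ in $[0, 1)$, namely $D_N = 1/N + \max_i (i/N - y_{(i)}) - \min_i (i/N - y_{(i)})$, which is brought in from the discrepancy literature and is an assumption relative to the text above. The function name `discrepancy` is hypothetical.

```python
from fractions import Fraction


def discrepancy(points):
    """Discrepancy D_N over intervals [a, b) of points lying in [0, 1)."""
    ys = sorted(Fraction(y) for y in points)
    n = len(ys)
    # i/N - y_(i) for the i-th smallest point, i = 1, ..., N
    gaps = [Fraction(i, n) - y for i, y in enumerate(ys, start=1)]
    # The supremum need not be attained by any single interval; this is its value.
    return Fraction(1, n) + max(gaps) - min(gaps)
```

For example, the two points 1/4 and 3/4 have discrepancy 1/2, and any single point has discrepancy 1, because arbitrarily short intervals can still contain it.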

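By an observation of Wall [21], a number $z$ is normal (in base 2) if and only if the sequence

\[ \left( \langle 2^{n-1} z \rangle \right)_{n \geq 1}, \]

where $\langle \cdot \rangle$ denotes the fractional part, is u.d. mod 1. Equivalently, $z$ is normal if and only if

\[ D_N\left( z, \langle 2z \rangle, \dots, \langle 2^{N-1} z \rangle \right) \to 0 \quad \text{as } N \to \infty. \]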

Korobov [14] posed the problem of finding a function $\psi(N)$ with maximal decay for which
there exists a number $z$ such that

\[ D_N\left( z, \langle 2z \rangle, \dots, \langle 2^{N-1} z \rangle \right) \leq \psi(N), \qquad N \geq 1. \]

The best result concerning this question is currently due to Levin [16], who proved
(constructively, by giving an explicit example) the existence of a $z$ for which

\[ D_N\left( z, \langle 2z \rangle, \dots, \langle 2^{N-1} z \rangle \right) = O\left( \frac{(\log N)^2}{N} \right) \quad \text{as } N \to \infty. \tag{4} \]



This result should be compared with a lower bound of Schmidt [20], stating that for any
sequence $(y_n)_{n \geq 1}$

\[ D_N(y_1, \dots, y_N) \geq c_{\mathrm{abs}} \frac{\log N}{N} \]

for infinitely many $N$. Thus Korobov's problem is solved, up to a logarithmic factor. It is also
interesting to compare (4) with the "typical" discrepancy of a normal number: for almost all $z \in [0, 1)$,

\[ \limsup_{N \to \infty} \frac{\sqrt{N} \, D_N\left( z, \langle 2z \rangle, \dots, \langle 2^{N-1} z \rangle \right)}{\sqrt{\log \log N}} = \frac{2 \sqrt{21}}{9} \]

(Fukuyama).



For more information on normal numbers we refer to [12, 19]; for an introduction to uniform
distribution and discrepancy theory, to [10, 15].

The main tool in the proof of Theorem 1 is the following lemma.

Lemma 1. Let $z \in [0, 1)$ be a real number, whose binary expansion is given by

\[ z = 0.z_1 z_2 z_3 \dots, \]

and assume that there exists a nondecreasing function $\Phi(N)$ such that

\[ D_N\left( z, \langle 2z \rangle, \dots, \langle 2^{N-1} z \rangle \right) \leq \frac{\Phi(N)}{N}, \qquad N \geq 1. \tag{5} \]

Then for each $N \geq 1$ the binary sequence $Z_N = (z_1, \dots, z_N)$ satisfies

\[ \mathcal{N}(Z_N) \leq \Phi(N). \]

In view of Levin's result (4), Theorem 1 is a direct consequence of the lemma.
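The deduction of Theorem 1 from Lemma 1 and Levin's bound can be spelled out as follows. The constants $C$ and $N_0$ are names introduced here for the implied constant and the threshold hidden in the $O$-notation of Levin's result; they are not notation from the paper, and the particular choice of $\Phi$ is one possible sketch among many.

```latex
% Levin's bound means: there are C > 0 and N_0 >= 1 such that
%   D_N(z, <2z>, ..., <2^{N-1} z>) <= C (log N)^2 / N   for all N >= N_0.
% Since every discrepancy is at most 1, the nondecreasing function
\[
  \Phi(N) = \max\bigl\{ N_0,\; C (\log N)^2 \bigr\}
\]
% satisfies the hypothesis of Lemma 1 for all N >= 1:
% for N < N_0 we have D_N <= 1 <= N_0 / N <= \Phi(N) / N.
% Lemma 1 then gives, for all N with C (\log N)^2 >= N_0,
\[
  \min_{E_N \in \{0,1\}^N} \mathcal{N}(E_N)
  \;\leq\; \mathcal{N}(Z_N)
  \;\leq\; \Phi(N)
  \;=\; C (\log N)^2 ,
\]
% which is Theorem 1 with c = C.
```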

3 Proof of Theorem 1

By the previous remark, to establish Theorem 1 it remains to prove Lemma 1. Let a number
$N$ be fixed, and assume that

\[ D_M\left( z, \langle 2z \rangle, \dots, \langle 2^{M-1} z \rangle \right) \leq \frac{\Phi(M)}{M} \]

holds for $1 \leq M \leq N$. To prove $\mathcal{N}(Z_N) \leq \Phi(N)$ we have to show that for any values of $k$, $X$
and $M$ satisfying $1 \leq k \leq \log_2 N$, $X \in \{0, 1\}^k$ and $1 \leq M \leq N - k + 1$ we have

\[ \left| T(Z_N, M, X) - \frac{M}{2^k} \right| \leq \Phi(N). \tag{6} \]

Let $k$, $X$ and $M$ satisfying these assumptions be fixed and write $X = (x_1, \dots, x_k)$, where
$x_1, \dots, x_k \in \{0, 1\}$. By definition,

\[ T(Z_N, M, X) = \# \{ n : 0 \leq n < M, \text{ and } (z_{n+1}, \dots, z_{n+k}) = (x_1, \dots, x_k) \}. \]

To $X$ we can assign an interval $I_X$ by setting

\[ I_X = \left[ \sum_{j=1}^{k} x_j 2^{-j}, \; \sum_{j=1}^{k} x_j 2^{-j} + 2^{-k} \right). \]

Then $I_X$ is a half-open interval of length $2^{-k}$. The following observation is the crucial point
of the proof of the lemma. We have

\[ (z_{n+1}, \dots, z_{n+k}) = (x_1, \dots, x_k) \]

if and only if

\[ \langle 2^n z \rangle \in I_X. \]

In fact, we have

\[ \langle 2^n z \rangle = 0.z_{n+1} z_{n+2} \dots, \]

and for any number $y \in [0, 1)$ the relation $y \in I_X$ holds if and only if the first $k$ digits of $y$
coincide with $(x_1, \dots, x_k)$.
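The correspondence between digit blocks and interval membership can be checked mechanically for rational $z$. The sketch below is illustrative only. It takes $I_X$ to be the interval $[a, a + 2^{-k})$ with $a = \sum_{j} x_j 2^{-j}$, which is the half-open interval of length $2^{-k}$ whose elements have first $k$ digits equal to $X$. All function names are hypothetical, and digits are generated by repeated doubling, so a dyadic rational gets its terminating expansion.

```python
from fractions import Fraction
from itertools import product


def binary_digits(z, n):
    """First n binary digits z_1, ..., z_n of z in [0, 1)."""
    digits = []
    for _ in range(n):
        z *= 2
        d = int(z)        # next digit, 0 or 1
        digits.append(d)
        z -= d            # keep the fractional part
    return digits


def pattern_interval(x):
    """Endpoints (a, a + 2^-k) of I_X, where a = sum_j x_j 2^-j."""
    a = sum(Fraction(xj, 2 ** j) for j, xj in enumerate(x, start=1))
    return a, a + Fraction(1, 2 ** len(x))


def correspondence_holds(z, n_max, k):
    """Check, for 0 <= n < n_max and all X of length k, that
    (z_{n+1}, ..., z_{n+k}) == X exactly when <2^n z> lies in I_X."""
    digits = binary_digits(z, n_max + k)
    for x in product((0, 1), repeat=k):
        lo, hi = pattern_interval(x)
        for n in range(n_max):
            shifted = (2 ** n * z) % 1              # fractional part <2^n z>
            in_interval = lo <= shifted < hi
            matches = tuple(digits[n:n + k]) == x
            if in_interval != matches:
                return False
    return True
```

A call such as `correspondence_holds(Fraction(5, 11), 40, 3)` runs the comparison for every pattern of length 3 over the first 40 shifts.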



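Consequently, we have

\[ T(Z_N, M, X) = \sum_{n=0}^{M-1} \mathbf{1}_{I_X}\left( \langle 2^n z \rangle \right) = \sum_{n=1}^{M} \mathbf{1}_{I_X}\left( \langle 2^{n-1} z \rangle \right). \tag{7} \]

Now by the assumption on the discrepancy of $z$ we have

\[ \left| \frac{1}{M} \sum_{n=1}^{M} \mathbf{1}_{I_X}\left( \langle 2^{n-1} z \rangle \right) - \frac{1}{2^k} \right| \leq \frac{\Phi(M)}{M}, \tag{8} \]

and consequently, multiplying equation (8) by $M$ and using (7), we obtain

\[ \left| T(Z_N, M, X) - \frac{M}{2^k} \right| \leq \Phi(M). \]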



Since by assumption the function $\Phi$ is nondecreasing and $M \leq N$, we have $\Phi(M) \leq \Phi(N)$. This
establishes (6), which proves Lemma 1.
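The statement of Lemma 1 can also be tested numerically on short prefixes. The sketch below is an illustration under stated assumptions, not the author's construction: it takes a rational $z$, uses the smallest admissible value $\Phi(N) = \max_{1 \leq M \leq N} M \cdot D_M$, computes the discrepancy with the standard sorted-points formula from the literature, and compares the result with the normality measure of the first $N$ digits. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def orbit(z, n):
    """Points z, <2z>, ..., <2^(n-1) z> for a Fraction z in [0, 1)."""
    pts = []
    for _ in range(n):
        pts.append(z)
        z = (2 * z) % 1
    return pts


def discrepancy(points):
    """D_N over intervals [a, b), via the sorted-points closed form."""
    ys = sorted(points)
    n = len(ys)
    gaps = [Fraction(i, n) - y for i, y in enumerate(ys, start=1)]
    return Fraction(1, n) + max(gaps) - min(gaps)


def normality_measure(e):
    """Brute-force normality measure of a 0/1 sequence e."""
    big_n = len(e)
    best = Fraction(0)
    for k in range(1, big_n.bit_length()):          # 1 <= k <= log2(N)
        for x in product((0, 1), repeat=k):
            count = 0
            for m in range(1, big_n + 2 - k):        # 1 <= M <= N + 1 - k
                if tuple(e[m - 1:m - 1 + k]) == x:   # window starting at n = M - 1
                    count += 1
                best = max(best, abs(count - Fraction(m, 2 ** k)))
    return best


def check_lemma(z, big_n):
    """True if N(Z_N) <= max_{M <= N} M * D_M for the first big_n digits of z."""
    pts = orbit(z, big_n)
    digits = [int(2 * y) for y in pts]               # z_n = floor(2 <2^(n-1) z>)
    phi = max(m * discrepancy(pts[:m]) for m in range(1, big_n + 1))
    return normality_measure(digits) <= phi
```

For instance, `check_lemma(Fraction(22, 97), 40)` evaluates both sides for the first 40 digits of 22/97. Rational numbers are not normal, but the inequality of the lemma concerns finite prefixes only, so such inputs are sufficient for a sanity check.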